Calibration of agent based models for monophasic and biphasic tumour growth using approximate Bayesian computation

Agent-based models (ABMs) are readily used to capture the stochasticity in tumour evolution; however, these models are often challenging to validate with experimental measurements due to model complexity. The Voronoi cell-based model (VCBM) is an off-lattice agent-based model that captures individual cell shapes using a Voronoi tessellation and mimics the evolution of cancer cell proliferation and movement. Evidence suggests tumours can exhibit biphasic growth in vivo. To account for this phenomenon, we extend the VCBM to capture the existence of two distinct growth phases. Prior work primarily focused on point estimation for the parameters without consideration of estimating uncertainty. In this paper, approximate Bayesian computation is employed to calibrate the model to in vivo measurements of breast, ovarian and pancreatic cancer. Our approach involves estimating the distribution of parameters that govern cancer cell proliferation and recovering outputs that match the experimental data. Our results show that the VCBM and its biphasic extension provide insight into tumour growth and quantify uncertainty in the switching time between the two phases of the biphasic growth model. We find this approach enables precise estimates for the time taken for a daughter cell to become a mature cell. This allows us to propose future refinements to the model to improve accuracy, whilst also drawing conclusions about the differences in cancer cell characteristics. Supplementary Information: The online version contains supplementary material available at 10.1007/s00285-024-02045-4.


Analysis of d_max
In this section, we analyze the sensitivity of the model parameter d_max to determine the range of the prior distribution. Figures S1-S2 depict the distribution of distances d for the population of tumour cells based on the volume of the tumour. The d values obtained are proportional to the volume of the tumour. Thus, as the tumour expands, more cells will be located further from its edge. It appears that d remains between 0 and 30 for the range of tumour volumes relevant to the in vivo datasets; in other words, no cell has a distance greater than about d = 30. Hence, we assign a uniform distribution on the interval from 0 to 50 as the prior for d_max. Figure S3 shows the effect of different values of d_max on the final tumour volume when the other model parameters are held fixed.
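The behaviour of the quantity 1 − d/d_max discussed with Figure S2 can be illustrated with a minimal sketch. The function name `distance_factor`, the example distances, and the clipping at zero for d ≥ d_max are assumptions made for illustration; only the expression 1 − d/d_max and the uniform prior on 0 to 50 come from the text.

```python
import numpy as np

def distance_factor(d, d_max):
    """Return 1 - d/d_max for distance(s) d.

    Clipping negative values to 0 (cells with d >= d_max) is an
    assumption of this sketch, not something stated in the text.
    """
    d = np.asarray(d, dtype=float)
    return np.clip(1.0 - d / d_max, 0.0, None)

# Draws from the uniform prior on [0, 50] assigned to d_max in the text.
rng = np.random.default_rng(0)
d_max_draws = rng.uniform(0.0, 50.0, size=5)

# Illustrative distances spanning the observed range 0 <= d <= 30.
distances = np.array([0.0, 10.0, 20.0, 30.0])
for d_max in (10.0, 30.0, 50.0):
    # Larger d_max makes the factor decay more slowly with d.
    print(d_max, distance_factor(distances, d_max))
```

With d_max = 50, a cell at the largest observed distance (d ≈ 30) still has a non-zero factor, which is consistent with choosing a prior upper bound above 30.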

Synthetic datasets
In this section, we provide the detailed configuration for the five synthetic datasets in Table S1. The visualizations of each dataset are presented in Figure S4. For the first three datasets, we use BVCBM, and for the remaining two, we use VCBM.

Posterior for synthetic datasets
In this section, we show the 50%, 80% and 95% posterior predictive intervals. Our results (see Figures S6a, S7a, S8a, S9a and S10a) show that SMC-ABC can recover every synthetic dataset with reasonable accuracy, as the associated tumour volume falls within at least one of the intervals in the posterior predictive plots. The estimated univariate posterior distributions for the synthetic time series (see Figures S6-S10) indicate that g_age is the most informative parameter for tumour growth, in the sense that its posterior is substantially more concentrated than the prior. However, the posteriors for p_0 and d_max are not substantially different from the prior, so these parameters cannot be identified from the data.
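The text does not give a numerical criterion for "more concentrated compared to the prior". One hypothetical way to quantify it, assuming equally weighted posterior samples and a uniform prior, is the ratio of the posterior standard deviation to the prior standard deviation. The function name `concentration_ratio` and the synthetic samples below are illustrative assumptions, not outputs of the paper's SMC-ABC runs.

```python
import numpy as np

def concentration_ratio(posterior_samples, prior_low, prior_high):
    """Posterior sd divided by the sd of a Uniform(prior_low, prior_high) prior.

    Values well below 1 suggest the data are informative about the
    parameter; values near 1 suggest the posterior resembles the prior.
    """
    prior_sd = (prior_high - prior_low) / np.sqrt(12.0)  # sd of a uniform
    return float(np.std(posterior_samples, ddof=1) / prior_sd)

rng = np.random.default_rng(2)
# Made-up samples: one tightly concentrated, one spread like the prior.
concentrated = rng.normal(25.0, 1.0, size=2000)
diffuse = rng.uniform(0.0, 50.0, size=2000)

print(concentration_ratio(concentrated, 0.0, 50.0))  # well below 1
print(concentration_ratio(diffuse, 0.0, 50.0))       # close to 1
```

A single ratio ignores changes in shape or location, so it complements rather than replaces the marginal posterior plots.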

Prior Predictive distributions
In this section, we plot the prior predictive distributions by drawing 1000 samples from the prior distribution and then generating simulations from the model. We plot the (0.25, 0.75), (0.1, 0.9), and (0.025, 0.975) prior predictive intervals. It is evident that most of the experimental datasets lie within the (0.1, 0.9) prior predictive interval.
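The procedure described above (draw from the prior, simulate, take pointwise quantiles) can be sketched as follows. The simulator here is a hypothetical noisy exponential-growth stand-in, not the VCBM, and the prior range for g_age, the initial volume, and the time grid are assumptions chosen only to make the example run.

```python
import numpy as np

def toy_simulator(g_age, times, rng):
    """Hypothetical stand-in for the model: noisy exponential growth
    in which g_age plays the role of a doubling time."""
    rate = np.log(2.0) / g_age
    noise = rng.normal(0.0, 0.1, size=times.shape)
    return 100.0 * np.exp(rate * times + noise)

def predictive_intervals(sims, levels=((0.25, 0.75), (0.1, 0.9), (0.025, 0.975))):
    """Pointwise quantile bands; rows of sims are simulated trajectories."""
    return {lv: np.quantile(sims, lv, axis=0) for lv in levels}

rng = np.random.default_rng(1)
times = np.arange(0.0, 30.0, 5.0)                 # assumed measurement days
g_age_prior = rng.uniform(2.0, 20.0, size=1000)   # assumed prior range
sims = np.array([toy_simulator(g, times, rng) for g in g_age_prior])

bands = predictive_intervals(sims)
lower_80, upper_80 = bands[(0.1, 0.9)]            # the 80% interval
print(lower_80, upper_80)
```

An observed time series can then be compared against `lower_80` and `upper_80` at each time point to see whether it lies within the 80% prior predictive interval.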

Posterior for breast tumour datasets
In this section, we show the univariate and bivariate posterior plots for breast tumour datasets in Figures S11-S16.

Posterior for ovarian tumour datasets
In this section, we show the univariate and bivariate posterior plots for ovarian tumour datasets in Figures S17-S22.

Bivariate plots for pancreatic tumour datasets
In this section, we present the bivariate plots for d_max and g_age for the two stages, as well as for τ, the switching time, in Figures S23-S26.

Figure S1: The distribution of distance d for the population of tumour cells based on the volume of the tumour (which is given in the title of each subplot). Each histogram of d corresponds to a different number of cells.

Figure S2: This plot illustrates the relationship between d and 1 − d/d_max. As d_max increases, the value of 1 − d/d_max decays more slowly. It appears that d remains between 0 and 30 for tumour volumes that align with the experimental data, indicating there are no cells with a distance greater than approximately d = 30.

Figure S3: The effect of the value of d_max in the VCBM while other model parameters are fixed: (a) final tumour volume generated from the model for different values of d_max; (b) final total number of cells generated from the model for different values of d_max.

Figure S4: Synthetic tumour volume measurements. (a) shows the synthetic time series datasets 1 to 3; (b) shows the synthetic time series dataset 4; and (c) shows the synthetic time series dataset 5. The parameters used to generate each dataset are provided in Table S1. For the first three datasets, we use BVCBM, and for the remaining two, we use VCBM.

Figure S5: Results for synthetic dataset 1: (a) shows the posterior predictive distribution; the green line shows the scaled-up posterior density for τ, used to indicate the switching time of tumour growth, and the true posterior density for τ is in (f); (b)-(f) show the marginal posterior for each parameter. The horizontal blue lines represent the prior distribution and the vertical lines in (e) and (f) mark the "true" values of the parameters.

Figure S6: Results for synthetic dataset 2: (a) shows the posterior predictive distribution; the green line shows the scaled-up posterior density for τ, used to indicate the switching time of tumour growth, and the true posterior density for τ is in (f); (b)-(f) show the marginal posterior for each parameter. The horizontal blue lines represent the prior distribution and the vertical lines in (e) and (f) mark the "true" values of the parameters.

Figure S7: Results for synthetic dataset 3: (a) shows the posterior predictive distribution; the green line shows the scaled-up posterior density for τ, used to indicate the switching time of tumour growth, and the true posterior density for τ is in (f); (b)-(f) show the marginal posterior for each parameter. The horizontal blue lines represent the prior distribution and the vertical lines in (e) and (f) mark the "true" values of the parameters.

Figure S8: Results for synthetic dataset 4: (a) shows the posterior predictive distribution; (b)-(e) show the marginal posterior for each parameter. The horizontal blue lines represent the prior distribution and the vertical line in (e) marks the "true" value of the parameter.

Figure S10: Prior predictive distributions for each of the experimental datasets. Black solid lines in (a)-(c) are breast, ovarian and pancreatic tumour datasets, respectively. The 50%, 80% and 95% prior predictive intervals are given as shaded regions on the plots; see the legend in (a).

Figure S11: Marginal posterior distributions for the first mouse in the breast cancer dataset. The horizontal blue lines represent the prior distribution and the green lines represent the marginal posterior distributions.

Figure S13: Marginal posterior distributions for the second mouse in the breast cancer dataset. The horizontal blue lines represent the prior distribution and the green lines represent the marginal posterior distributions.

Figure S15: Marginal posterior distributions for the third mouse in the breast cancer dataset. The horizontal blue lines represent the prior distribution and the green lines represent the marginal posterior distributions.

Figure S17: Marginal posterior distributions for the first mouse in the ovarian cancer dataset. The horizontal blue lines represent the prior distribution and the green lines represent the marginal posterior distributions.

Figure S19: Marginal posterior distributions for the second mouse in the ovarian cancer dataset. The horizontal blue lines represent the prior distribution and the green lines represent the marginal posterior distributions.

Figure S20: Bivariate plot for the second mouse in the ovarian cancer dataset. The diagonal shows the marginal posterior distribution for each parameter, while the off-diagonal plots display the bivariate plots for each pair of parameters.

Figure S21: Marginal posterior distributions for the third mouse in the ovarian cancer dataset. The horizontal blue lines represent the prior distribution and the green lines represent the marginal posterior distributions.

Figure S23: Bivariate plot for the first mouse in the pancreatic cancer dataset. The diagonal shows the marginal posterior distribution for each parameter, while the off-diagonal plots display the bivariate plots for each pair of parameters.

Table S1: Parameters used in the generation of the five synthetic datasets. For the first three datasets, we use BVCBM, and for the remaining two, we use VCBM.